NUTCH-3130 Address deprecated API usage across Nutch codebase and build #869

Open
lewismc wants to merge 17 commits into apache:master from lewismc:NUTCH-3130
Member

@lewismc lewismc commented Oct 20, 2025

Proposed fix for https://issues.apache.org/jira/browse/NUTCH-3130
This is a fairly straightforward PR, with some grunt work involved.
I took some time to investigate the issues with, and options for, removing the @Override finalize() methods. This may need more investigation, and certainly some testing, to make sure this PR doesn't break any behavior.

@lewismc lewismc self-assigned this Oct 20, 2025

@Override
protected void finalize() throws Throwable {
shutDown();
Member Author

@lewismc lewismc Oct 20, 2025

I'm not sure we can simply remove the call to shutDown(). I need to investigate the options further and confirm.

Contributor

Same for me.

Member Author

@lewismc lewismc Feb 23, 2026

@sebastian-nagel the smoke test passes: https://ci-builds.apache.org/job/Nutch/job/Nutch-Smoke-Test-Single-Node-Hadoop-Cluster/49/
I'm honestly not sure how to test this further... the same goes for similar removals in this PR.
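
For reference, one commonly used replacement for a finalize() that only calls shutDown() is explicit closing via AutoCloseable, with java.lang.ref.Cleaner (Java 9+) as a safety net. The following is only a minimal sketch under assumptions: ShutdownSketch, its State class, and shutdownCount() are hypothetical names for illustration and are not taken from the Nutch code touched by this PR.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, not part of Nutch: shows shutDown() without finalize().
class ShutdownSketch implements AutoCloseable {
  private static final Cleaner CLEANER = Cleaner.create();

  // The cleanup action must not hold a reference to the outer instance,
  // otherwise the instance can never become unreachable.
  static final class State implements Runnable {
    final AtomicInteger runs = new AtomicInteger();

    @Override
    public void run() {
      // Real code would release threads, connections, etc. here.
      runs.incrementAndGet();
    }
  }

  private final State state = new State();
  private final Cleaner.Cleanable cleanable = CLEANER.register(this, state);

  // Explicit shutdown; Cleanable.clean() runs the action at most once.
  public void shutDown() {
    cleanable.clean();
  }

  // Allows try-with-resources to trigger the same shutdown path.
  @Override
  public void close() {
    shutDown();
  }

  public boolean isShutDown() {
    return state.runs.get() > 0;
  }

  public int shutdownCount() {
    return state.runs.get();
  }
}
```

With this shape, callers that already invoke shutDown() keep working, repeated calls are harmless, and the Cleaner only acts if an instance becomes unreachable without having been shut down. Whether that fallback is needed at all for the classes in this PR is exactly the open question above.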


@Override
protected void finalize() {
/**
Member Author

I'm also not sure about this one right now. I need to do some more research.

Member Author

lewismc commented Oct 20, 2025

I also wanted to note that there are 15 instances of the @Deprecated annotation across 9 files in the codebase:
src/java/org/apache/nutch/tools/CommonCrawlFormatFactory.java - 1 instance
src/plugin/lib-http/src/test/org/apache/nutch/protocol/http/api/TestRobotRulesParser.java - 2 instances
src/test/org/apache/nutch/crawl/CrawlDbUpdateUtil.java - 3 instances
src/test/org/apache/nutch/crawl/CrawlDBTestUtil.java - 3 instances
src/java/org/apache/nutch/util/NutchJob.java - 1 instance
src/java/org/apache/nutch/protocol/RobotRulesParser.java - 1 instance
src/java/org/apache/nutch/protocol/ProtocolStatus.java - 2 instances
src/java/org/apache/nutch/net/protocols/ProtocolException.java - 1 instance
src/java/org/apache/nutch/crawl/Generator.java - 1 instance

I'm not sure whether this is the right time to remove these methods, but if we decide to keep them we should definitely augment the Javadoc to detail the replacement API usage.
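
As a rough illustration of what such Javadoc could look like, here is a minimal sketch. The class StatusCodes, its methods, and the "x.y" version string are hypothetical placeholders, not the actual Nutch APIs listed above.

```java
// Hypothetical class, not part of Nutch: shows a deprecated method whose
// Javadoc names the replacement API.
final class StatusCodes {

  private StatusCodes() {
  }

  /**
   * Returns a human-readable name for a status code.
   *
   * @param code numeric status code
   * @return name of the status
   * @deprecated Use {@link #describe(int)} instead, which returns the same
   *             value. The "x.y" version below is a placeholder.
   */
  @Deprecated(since = "x.y")
  static String getName(int code) {
    // Delegate to the replacement so both paths stay consistent.
    return describe(code);
  }

  /**
   * Replacement API for {@link #getName(int)}.
   *
   * @param code numeric status code
   * @return name of the status
   */
  static String describe(int code) {
    return code == 1 ? "SUCCESS" : "UNKNOWN(" + code + ")";
  }
}
```

Pairing the @Deprecated annotation with a @deprecated Javadoc tag that links to the replacement lets both the compiler warning and the generated documentation point users at the new call.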

Contributor

@sebastian-nagel sebastian-nagel left a comment

Great! Thanks, @lewismc!

Some work still to be done. Let me know whether I shall take over.


Member Author

lewismc commented Dec 18, 2025

I didn't see that you had responded here, @sebastian-nagel. I will revisit this PR, update it and perform some more testing.
I'm thinking about profiling some crawl cycles to test the removal of the finalize() methods. I am not confident in those proposals right now!

Member Author

lewismc commented Feb 23, 2026

I'm currently smoke testing the updated PR https://ci-builds.apache.org/job/Nutch/job/Nutch-Smoke-Test-Single-Node-Hadoop-Cluster/38
It would be fantastic if anyone else could test as well.
I will note that the current PR REMOVES runtime checks against Java 11 because the index-geoip plugin CANNOT run on Java 11. Implementing a workaround for this was getting really messy, so I decided to simplify the CI workflow.

@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot
0.0% Coverage on New Code (required ≥ 80%)
